Learning to talk about events from narrated video in a construction grammar framework

Authors

  • Peter Ford Dominey
  • Jean-David Boucher
Abstract

The current research presents a system that learns to understand object names, spatial relation terms and event descriptions from observing narrated action sequences. The system extracts meaning from observed visual scenes by exploiting perceptual primitives related to motion and contact in order to represent events and spatial relations as predicate-argument structures. Learning the mapping between sentences and the predicate-argument representations of the situations they describe results in the development of a small lexicon, and a structured set of sentence form-to-meaning mappings, or simplified grammatical constructions. The acquired grammatical construction knowledge generalizes, allowing the system to correctly understand new sentences not used in training. In the context of discourse, the grammatical constructions are used in the inverse sense to generate sentences from meanings, allowing the system to describe visual scenes that it perceives. In question and answer dialogs with naïve users the system exploits pragmatic cues in order to select grammatical constructions that are most relevant in the discourse structure. While the system embodies a number of limitations that are discussed, this research demonstrates how concepts borrowed from the construction grammar framework can aid in taking initial steps towards building systems that can acquire and produce event language through interaction with the world. © 2005 Elsevier B.V. All rights reserved.

Similar resources

Learning Word Meaning And Grammatical Constructions From Narrated Video Events

The objective of this research is to develop a system for miniature language learning based on a minimum of prewired language-specific functionality, that is compatible with observations of perceptual and language capabilities in human development. In the proposed system, meaning is extracted from video images based on detection of physical contact and its parameters. Mapping of sentence form t...

Learning Grammatical Constructions in a Miniature Language from Narrated Video Events

The objective of this research is to develop a system for miniature language learning based on a minimum of prewired language-specific functionality, that is compatible with observations of perceptual and language capabilities in human development. In the proposed system, meaning is extracted from video images based on detection of physical contact and its parameters. Mapping of sentence form t...

Recognition of Visual Events using Spatio-Temporal Information of the Video Signal

Recognition of visual events as a video analysis task has become popular in the machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...

Reflective Teaching in the Context of a Video Club: Nurturing Professional Relationships and Building a Learner Community

The purpose of this study was to examine how four teachers used the seven processes of videotape analysis to develop an analytic approach and reflective thinking towards their teaching. The study was organized within video clubs and was used to describe the interactions among four teachers about their experiences at a language institute. Data were gathered through videotaped recordings of lesso...

Evaluation of Midwifery Student's Attitude, Performance and Satisfaction from teaching clinical skills with the Video in Hamedan School of Nursing and Midwifery (2019)

1. Duncan I, Yarwood-Ross L, Haigh C. YouTube as a source of clinical skills education. Nurse Education. 2013; 33 (12): 1576–1580. 2. Arguel ., Jamet E. Using video and static pictures to improve learning of procedural contents. Comput. Hum. Behav. 2008; 25 (2): 354–359. 3. Johnson N, List-Ivankovic J, Eboh W, Ireland ., Adams D, Mowatt E, Martindale S. Research and evidence based pra...

Journal:
  • Artif. Intell.

Volume 167

Publication date: 2005